Prediction of Hospital Charges for the Cancer Patients with Data Mining Techniques
Authors
Abstract
Objective: Predicting hospital charges for cancer patients is important because the predictions provide a basis for allocating medical resources within a hospital and for establishing national medical policies. Previous studies, however, relied mainly on statistical analysis that used only a small portion of the large volume of available medical data, which limited their predictive power. We therefore developed four data mining models, two artificial neural network (ANN) models and two classification and regression tree (CART) models, to predict both the total hospital charges and the amount paid by insurance for cancer patients, and compared their performance.
Methods: The data were drawn from 400,625 medical records of 1,605 cancer patients hospitalized at Kyung Hee University Hospital from March 1, 2003 to February 29, 2004. Clementine 8.1 was used to build the four prediction models, two for the total amount and two for the amount paid by insurance. The variables included all data fields of the standard Korean medical record form. The neural network models were feed-forward networks with two hidden layers, trained by back-propagation. For the decision tree models, the RELIEFF method was used and the maximum tree depth was set to 30. The dataset was divided by stratified sampling into a training set (67%) and a test set (33%). The models were compared using the linear correlation coefficient and gain charts.
Results: The ANN models achieved higher linear correlation coefficients than the CART models for both the total amount (0.824 vs. 0.791) and the amount paid by insurance (0.838 vs. 0.699). The estimated accuracy of the ANN models exceeded 98% for both the total amount and the amount paid by insurance. In the CART model for the total amount, the relative importance of the variables was: duration of admission (0.073), number of consultations (0.061), and treatment group 16 (0.06). In the CART model for the amount paid by insurance, it was: duration of admission (0.09), number of ICU admissions (0.063), and number of consultations (0.062). In the gain charts, the ANN model showed a higher percent gain than the CART model for the total amount, whereas for the amount paid by insurance the two models showed similar patterns.
Conclusion: The ANN models showed better prediction accuracy than the CART models. However, the CART models, which provide different information from the ANN models, can be used to allocate limited medical resources effectively and efficiently. Using the two types of model together is therefore warranted when establishing medical policies and strategies. (Journal of Korean Society of Medical Informatics 15-1, 13-23, 2009)
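The two comparison criteria named in the abstract, the linear correlation coefficient and the gain chart, can be illustrated with a minimal Python sketch. This is not the authors' Clementine workflow: the function names `pearson_r` and `cumulative_gain`, the synthetic charge data, and the particular gain definition (the share of total actual charges captured among the cases ranked highest by predicted charge) are all assumptions made for illustration.

```python
import math
import random


def pearson_r(actual, predicted):
    """Linear (Pearson) correlation between actual and predicted values."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_p)


def cumulative_gain(actual, predicted, steps=10):
    """Assumed gain definition: fraction of total actual charges captured
    in the top fraction of cases, ranked by predicted charge (descending)."""
    order = sorted(range(len(actual)), key=lambda i: predicted[i], reverse=True)
    total = sum(actual)
    points = []
    for step in range(1, steps + 1):
        k = round(len(order) * step / steps)
        captured = sum(actual[i] for i in order[:k])
        points.append((step / steps, captured / total))
    return points


# Synthetic data only (hypothetical charges, not the study's records).
rng = random.Random(0)
actual_charges = [rng.uniform(1_000, 50_000) for _ in range(200)]
predicted_charges = [a + rng.gauss(0, 8_000) for a in actual_charges]

print("correlation:", pearson_r(actual_charges, predicted_charges))
for fraction, gain in cumulative_gain(actual_charges, predicted_charges, steps=5):
    print(f"top {fraction:.0%} of cases -> {gain:.1%} of charges")
```

Under this definition, a model whose gain curve rises faster ranks high-charge cases more reliably, which is one way to read a comparison such as the ANN versus CART gain charts described above.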
Similar resources
Detection of Breast Cancer Progress Using Adaptive Nero Fuzzy Inference System and Data Mining Techniques
Predicting and diagnosing breast cancer, and tracking recovery and recurrence among patients, remain among the most important challenges for researchers. With bioinformatics, these challenges can now be addressed using information from patients' previous records. This paper uses an adaptive neuro-fuzzy inference system and data mining techniqu...
Comparison of Hospital Charge Prediction Models for Colorectal Cancer Patients: Neural Network vs. Decision Tree Models
Analysis and prediction of the care charges related to colorectal cancer in Korea are important for the allocation of medical resources and the establishment of medical policies because the incidence and the hospital charges for colorectal cancer are rapidly increasing. But the previous studies based on statistical analysis to predict the hospital charges for patients did not show satisfactory ...
Using data mining techniques for predicting the survival rate of breast cancer patients: a review article
This review was conducted between December 2018 and March 2019 at Isfahan University of Medical Sciences. It examined which data mining techniques have been used to predict the probability of survival, which risk factors these predictions rely on, which criteria are used to evaluate the techniques, and finally which data sources have been used to predict the surv...
The prediction of lymphedema via the combination of the selected data mining algorithms
Background: Breast cancer is the second leading cause of cancer death in women, after lung cancer. Given the importance of predicting this disease, data mining methods have become more significant in medical research than before. Data mining algorithms can be a great help in preventing the development of lymphedema in patients. The aim of this study was to create a diagnosis system that can ...
Using Combined Descriptive and Predictive Methods of Data Mining for Coronary Artery Disease Prediction: a Case Study Approach
Heart disease is one of the major causes of morbidity in the world. Currently, large proportions of healthcare data are not processed properly, thus, failing to be effectively used for decision making purposes. The risk of heart disease may be predicted via investigation of heart disease risk factors coupled with data mining knowledge. This paper presents a model developed using combined descri...
Determining the Factors Affecting the Incidence of Gastric Cancer Using a Data Mining Approach
Background and Aim: Gastric cancer is the second leading cause of cancer death in the world. Given the prevalence of the disease and the high mortality rate of gastric cancer in Iran, the factors affecting its development should be taken into account. In this research, two data mining techniques, the Apriori and ID3 algorithms, were used to investigate the effective f...